-
Notifications
You must be signed in to change notification settings - Fork 5.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
replace YulString with YulName typedef #15083
Conversation
fdb537b
to
1e91d0e
Compare
What's this about? Is this regarding the Yul AST rework to get rid of the AST ID dependencies (or to at least strengthen them) in order then get rid of codegen non-determinism? Or is it something else? |
The idea is to get rid of YulString as a kind of a leightweight string proxy in favor of a simple incremental uint64 index for handles (with a backing string vector of names) and do the string generation of derived names post-hoc |
Alright, so killing the non-determinism in its root. |
Depends on the level at which you're looking - but sure, restarting the optimization with an intermediate will kill determinism with that approach unless we do reordering after each step (which may not be so bad, let's see) |
Hi @bshastry, as this changes some stuff in the fuzzing department as well, does the PR mess things up on your end or is it fine? :) |
libyul/AST.h
Outdated
using TypedNameList = std::vector<TypedName>; | ||
|
||
/// Literal number or string (up to 32 bytes) | ||
enum class LiteralKind { Number, Boolean, String }; | ||
struct Literal { langutil::DebugData::ConstPtr debugData; LiteralKind kind; YulString value; Type type; }; | ||
struct Literal { langutil::DebugData::ConstPtr debugData; LiteralKind kind; YulName value; Type type; }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The PR seems to be just replacing all instances of YulString
with YulName
. That's not very useful. The idea was to rename only in those cases where YulString
is used for identifiers. Those are the cases which we'd want to replace with numeric IDs and we'll do this by switching YulName
to be some other type.
Literal
is a good example of a situation where this must not happen. If you have a string "xyz"
, in the code, it must remain as "xyz"
in the AST. It can't be replaced with some number that we'll only later remap back to the original value.
I think we don't want this for TypedName
either. Types are defined in the dialect so renaming then just in the AST will make us unable to find the right type definition.
BTW, TypeName
is probably a legacy thing at this point and maybe we'll remove it eventually. It is used in the typed Yul dialect, which was an intermediate step in the process of translating the code to Ewasm. Ewasm was dropped and I'm not sure if we'll have any other use for it. In the untyped dialect Type
will always be just the default type.
As for FunctionDefinition
, for that one using YulName
seems correct, though I'm surprised it doesn't use a whole Identifier
instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was under the impression you'd wanted to get rid of YulString
altogether, my bad then.
As for Literal
and TypedName
: How would replacing a string "xyz"
with some identifier and later remapping it be conceptually different from using YulString
, which replaces the string "xyz"
with a handle (id/hash) to later remap it?
@@ -59,7 +60,7 @@ struct VariableDeclaration { langutil::DebugData::ConstPtr debugData; TypedNameL | |||
/// Block that creates a scope (frees declared stack variables) | |||
struct Block { langutil::DebugData::ConstPtr debugData; std::vector<Statement> statements; }; | |||
/// Function definition ("function f(a, b) -> (d, e) { ... }") | |||
struct FunctionDefinition { langutil::DebugData::ConstPtr debugData; YulString name; TypedNameList parameters; TypedNameList returnVariables; Block body; }; | |||
struct FunctionDefinition { langutil::DebugData::ConstPtr debugData; YulName name; TypedNameList parameters; TypedNameList returnVariables; Block body; }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We may have a small problem here. With this change, user-defined functions will use YulName
, while the built-in functions in the dialect will still be identified with YulString
. Since the latter by design have fixed names, I'm not sure it's even feasible to use YulName
for them. But then FunctionCall
node will not be able to refer to them.
{ | ||
|
||
class YulString; | ||
using YulName = YulString; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unless you're planning to add something more here, like utilities (or maybe make YulName
into a class in the future), it would be enough to put it in AST.h
. We'll need it pretty much only in Yul AST. In fact, if there is some situation where it does not seem appropriate to include the whole AST.h
, it could be a sign that YulString
there is not really used for identifiers.
Sorry for the late reply :-( Since the ossfuzz build passed, I would assume it is fine with the fuzzer as well. Dynamic breakages (at runtime) need to be seen though, let me know if you want me to run the fuzzer on this PR. That should throw up issues if any! |
Cheers, no worries. I expect this PR to be closed anyways and there's going to be a new one with a different set of changes. That will also impact the fuzzer, most likely. I'll let you know :) |
This pull request is stale because it has been open for 14 days with no activity. |
This pull request is stale because it has been open for 14 days with no activity. |
This pull request is stale because it has been open for 14 days with no activity. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I went through all the substitutions here and, surprisingly, there's actually very little use of YulString
for things that are not identifiers. The main case were the values of literals, but that has already been eliminated by #15112.
One case where I'm not 100% sure is the tracking of memory, storage and keccak values in DataflowAnalyser
. I think YulString
is only used to refer to variables there but without taking a deeper look, I'm not confident enough to say that we never put actual values in it.
There are a few cases, where it's kinda an identifier, but cannot just be freely replaced with a different one:
- Use for qualified Yul object and data names - these contain dots that are really a part of syntax. This should probably be replaced by
std::string
or vectors ofYulName
. - The
"m"_yulstring
and"s"_yulstring
placeholders inUnusedStoreEliminator.h
have special meaning. - Empty
YulString
often has a special meaning (e.g. inKnowledgeBase.h
).
Other than these, I see YulString
used only to name functions, variables, types and Yul objects.
As such, this change would actually be mostly fine, though IIRC last time we talked about it, we said that we won't be merging it.
If we do decide to proceed with it, the _yulstring
suffix should be renamed to _yulname
as well and the whole YulString
class should be removed since it looks like we won't need it for anything other than names.
In #15215 I have done the following to address the points you brought up:
I am in the process of breaking down that way too big PR into smaller (or if not small, at least hopefully trivial) ones |
1e91d0e
to
ead0291
Compare
5ae46a2
to
46cf485
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks fine overall, just has a few places that mangle line wrapping for some reason.
Also, I think we may have been better off just renaming YulString
in place, but since follow-up PRs are starting to pile up and I'm not sure if it wouldn't cascade into some more changes, I'm fine with merging this regardless.
589ee29
to
ba42473
Compare
Thanks for the feedback! I think my editor was playing some tricks on me with the formatting:) |
ba42473
to
96fdcc3
Compare
in preparation of type change of YulString s.t. these changes are more localized